Declarative Knowledge Discovery in Industrial Databases

Author

  • Stephen Muggleton
Abstract

Industry is increasingly overwhelmed by large-volume data. For example, the pharmaceutical industry generates vast quantities of data both internally as a side-effect of screening tests and combinatorial chemistry, as well as externally from sources such as the human genome project. Industry is also becoming predominantly knowledge-driven. Increased understanding not only improves products, but is also central in market assessment and strategic decision making. From a computer science point of view, the knowledge requirements within industry often give higher emphasis to "knowing that" (declarative or descriptive knowledge) rather than "knowing how" (procedural or prescriptive knowledge). Mathematical logic has always been the preferred representation for declarative knowledge and thus knowledge discovery techniques are required which generate logical formulae from data. Inductive Logic Programming (ILP) is such a technique. Logic programs provide a powerful and flexible representation for constraints, grammars, plans, equations and temporal relationships. New techniques developed within the 1990s allow general-purpose ILP systems to construct logic programs from a mixture of raw data and encoded domain knowledge. This paper will review the results of the last few years' academic pilot studies involving the application of ILP to problems in the pharmaceutical, telecommunications and automobile industries. While predictive accuracy is the central performance measure of data analytical techniques which generate procedural knowledge (neural nets, decision trees, etc.), the performance of an ILP system is determined both by accuracy and degree of insight provided. ILP hypotheses can be easily stated in English and often automatically exemplified pictorially. This allows cross-checking with other relevant domain knowledge. In several of the comparative trials presented ILP systems provided significant insights where other data analysis techniques do not. The scene appears now to be set for commercially-oriented application of ILP in industry.

1 Stephen Muggleton has recently accepted an invitation to take up the new Chair of Machine Learning at the University of York.

Similar resources

Application of Rough Set Theory in Data Mining for Decision Support Systems (DSSs)

Decision support systems (DSSs) are prevalent information systems for decision making in many competitive business environments. In a DSS, the decision-making process is intimately related to some factors which determine the quality of information systems and their related products. Traditional approaches to data analysis usually cannot be implemented in sophisticated companies, where managers ne...

Declarative Rules for Annotated Expert Knowledge in Change Management

In this paper, we use declarative and domain–specific languages for representing expert knowledge in the field of change management in organisational psychology. Expert rules obtained in practical case studies are represented as declarative rules in a deductive database. The expert rules are annotated by information describing their provenance and confidence. Additional provenance information f...

From Facts to Rules to Decisions: An Overview of the FRD-1 System

One of the central goals of knowledge discovery in databases is to produce knowledge useful for decision making. However, the form in which knowledge can be easily discovered is often different from the form in which it is most readily used for decision making. This paper describes a methodology in which knowledge is discovered in declarative form (decision rules). When it is needed for decisi...

An ILP-Based Concept Discovery System for Multi-Relational Data Mining

Kavurucu, Yusuf. Ph.D., Department of Computer Engineering. Supervisor: Asst. Prof. Dr. Pınar Şenkul. July 2009, 118 pages. Multi Relational Data Mining has become popular due to the limitations of propositional problem definition in structured domains and the tendency of storing data in relational databases. However, as patter...

Decomposition in Data Mining: An Industrial Case Study

Data mining offers tools for discovery of relationships, patterns, and knowledge in large databases. The knowledge extraction process is computationally complex and therefore a subset of all data is normally considered for mining. In this paper, numerous methods for decomposition of data sets are discussed. Decomposition enhances the quality of knowledge extracted from large databases by simpli...

Publication date: 1997